Minimal Landmarks for Optimal Delete-Free Planning

Authors

  • Patrik Haslum
  • John K. Slaney
  • Sylvie Thiébaux
Abstract

We present a simple and efficient algorithm to solve delete-free planning problems optimally and calculate the h+ heuristic. The algorithm efficiently computes a minimum-cost hitting set for a complete set of disjunctive action landmarks generated on the fly. Unlike other recent approaches, the landmarks it generates are guaranteed to be set-inclusion minimal. In almost all delete-relaxed IPC domains, this leads to a significant coverage and runtime improvement.

Introduction

Problems without delete lists play an important role in the planning literature. Most planning heuristics, including the admissible heuristics hmax, additive hmax, LM-cut, and variants (Bonet and Geffner 2001; Haslum, Bonet, and Geffner 2005; Coles et al. 2008; Helmert and Domshlak 2009; Bonet and Helmert 2010), are based on the delete relaxation of the planning problem, which ignores delete lists. These heuristics attempt to find good lower bounds on the cost h+ of the optimal delete-relaxed plan, which is NP-equivalent to compute and hard to approximate (Bylander 1994; Betz and Helmert 2009). Hence practical methods for computing h+ are crucial to assess how close such planning heuristics are able to get to the holy grail they seek.

Delete-free problems of interest in their own right are also starting to appear. An example originating in systems biology is the minimal seed-set problem (Gefen and Brafman 2011), which challenges both classical optimisation methods and optimal planners. Moreover, recent work on translating NP problems into an NP fragment of classical planning opens the possibility that optimal delete-free planning could directly be used to solve a wide range of NP-hard problems (Porco, Machado, and Bonet 2011). Hence practical methods for solving delete-free problems are also important to extend the reach of planning technology.

A possible approach to solving such problems is to treat them as any other planning problem and use any cost-optimal planning algorithm.
However, Helmert and Domshlak (2009, Table 1) show that this approach, or even resorting to domain-specific procedures, still leaves many cases where h+ cannot be computed. Besides two papers in this conference (Gefen and Brafman 2012; Pommerening and Helmert 2012), there is no other published work on optimal planners designed specifically for general delete-free planning problems.

Bonet and Helmert (2010) established that determining h+ amounts to computing a minimum-cost hitting set for a complete set of action landmarks for the relaxed problem, and Bonet and Castillo (2011) described an algorithm for computing such a complete set on the fly. However, neither tried to compute h+, for fear that solving the general hitting set problem optimally would be too hard, settling instead for computing improvements on LM-cut. Bonet and Castillo (2011) state that “in the future [they] would like to develop an effective algorithm for computing exact h+ values”.

This is exactly what we do here. We present a simple algorithm which efficiently computes a minimum-cost hitting set of a complete set of landmarks generated on the fly. Unlike in previous work, the landmarks it generates are set-inclusion minimal. We show that this results in simpler hitting set problems and significant improvements in coverage and runtime.

Copyright © 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Background

We adopt the standard definition of a propositional STRIPS planning problem, without negation in action preconditions or the goal (see, e.g., Ghallab, Nau, and Traverso 2004, chapter 2). Each action a has a non-negative cost, cost(a), and the cost of a plan is the sum of the costs of the actions in it. The initial state is denoted by s_I and the goal by G.

The Delete Relaxation

The delete relaxation P+ of a planning problem P is a problem exactly like P except that del(a) = ∅ for each a, i.e., no action makes any atom false.
The delete relaxation heuristic, h+, is defined as the minimum cost of any plan for the delete relaxation P+. Let A be a set of actions in P. We denote by R(A) the set of all atoms that are reachable in P+, starting from the initial state, using only actions in A. We say a set of actions A is a relaxed plan iff the goal G ⊆ R(A). An actual plan for the delete-relaxed problem is of course a sequence of actions. That G ⊆ R(A) means there is at least one sequencing of the actions in A that reaches a goal state. Such a sequencing can be generated in linear time, starting from s_I and selecting actions greedily as they become applicable. We assume throughout that G ⊆ R(A) for some set of actions A (which may be the set of all actions in P), i.e., that the goal is relaxed reachable. If it is not, the problem is unsolvable and h+ = ∞.

Relevance Analysis

For optimal delete-free planning, we only need to consider actions that can achieve a relevant atom for the first time. This enables more aggressive, yet still optimality-preserving, pruning of irrelevant actions. The standard relevance analysis method in planning is to back-chain from the goal: goal atoms are relevant, an action that achieves a relevant atom is relevant, and atoms in the precondition of a relevant action are relevant. We strengthen this analysis in the delete-free case by considering as relevant only actions that are possible first achievers of a relevant atom. Action a is a possible first achiever of atom p if p ∈ add(a) and a is applicable in a state that is relaxed reachable with only actions that do not add p, i.e., if pre(a) ⊆ R({a′ | p ∉ add(a′)}).

Iterative Minimal Landmark Generation

A disjunctive action landmark (landmark, for short) of a problem P is a set of actions such that at least one action in the set must be included in any valid plan for P. Hence, for any collection of landmarks L for P, the set of actions in any valid plan for P is a hitting set for L, i.e., contains an action from each set in L.
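As an illustration of the definitions so far, the following Python sketch computes R(A), tests whether a set of actions is a relaxed plan, sequences a relaxed plan greedily, and checks the possible-first-achiever condition. It is a minimal sketch and not the authors' implementation: the `Action` tuple and all function names are assumptions introduced here, and the naive fixpoint and sequencing loops are quadratic rather than the linear-time procedures the text refers to.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Action(NamedTuple):
    """Hypothetical delete-free action: a name, preconditions and add effects."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]


def reachable(actions: Iterable[Action], init: Iterable[str]) -> Set[str]:
    """R(A): atoms relaxed reachable from `init` using only `actions`."""
    acts = list(actions)
    reached = set(init)
    changed = True
    while changed:  # naive fixpoint; stops when no action adds a new atom
        changed = False
        for a in acts:
            if a.pre <= reached and not a.add <= reached:
                reached |= a.add
                changed = True
    return reached


def is_relaxed_plan(actions: Iterable[Action], init: Iterable[str],
                    goal: Iterable[str]) -> bool:
    """A is a relaxed plan iff G is a subset of R(A)."""
    return set(goal) <= reachable(actions, init)


def sequence(actions: Iterable[Action], init: Iterable[str]) -> List[Action]:
    """Order the actions greedily, applying each one once it becomes applicable."""
    state, order, pending = set(init), [], list(actions)
    progress = True
    while progress:
        progress = False
        for a in list(pending):
            if a.pre <= state:
                state |= a.add
                order.append(a)
                pending.remove(a)
                progress = True
    return order  # actions that never become applicable are left out


def possible_first_achiever(a: Action, p: str, all_actions: Iterable[Action],
                            init: Iterable[str]) -> bool:
    """True iff p is in add(a) and pre(a) is reachable without any action adding p."""
    if p not in a.add:
        return False
    non_adders = [b for b in all_actions if p not in b.add]
    return a.pre <= reachable(non_adders, init)


# Tiny example: a3 adds p but needs g, which is only reachable through p.
a1 = Action("a1", frozenset({"i"}), frozenset({"p"}))
a2 = Action("a2", frozenset({"p"}), frozenset({"g"}))
a3 = Action("a3", frozenset({"g"}), frozenset({"p"}))
actions = [a2, a3, a1]
```

In this example, a1 is a possible first achiever of p while a3 is not, so the strengthened relevance analysis would discard a3 even though plain back-chaining from the goal would keep it.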
For any set of actions A such that G ⊈ R(A), the complement Ā of A (w.r.t. the whole set of actions) is a disjunctive action landmark for P. If A is an inclusion-maximal such “relaxed non-plan”, Ā is an inclusion-minimal landmark, since if some proper subset of Ā were also a landmark, A would not be maximal. This leads to the following algorithm for computing h+: Initialise the landmark collection L to ∅. Repeatedly (1) find a minimum-cost hitting set A for L; (2) test if G ⊆ R(A); and if not, (3) extend A, by adding actions, to a maximal set A′ such that G ⊈ R(A′), and add the complement of A′ to the landmark collection.

The algorithm terminates when the hitting set A contains enough actions to reach the goal and is thus a relaxed plan. There is no lower-cost relaxed plan, because any plan for P must contain an action from every landmark in L, and A is a minimum-cost hitting set. Finally, the algorithm eventually terminates, because every iteration adds a new landmark, which cannot already be in L as it is not hit by A, and there is only a finite set of minimal landmarks for P.

The algorithm is given in Figure 1. Details about the procedures NEWLANDMARK (which implements step (3) above) and MINCOSTHITTINGSET are given below in the sections on ‘Testing Relaxed Reachability’ and ‘Finding a Minimum-Cost Hitting Set’ respectively.

Improvements to the Algorithm

In fact, any set of actions that hits every landmark in L but is not a relaxed plan can be used as the starting point to generate a new landmark that is not in L. This observation is the basis for two improvements to the algorithm.

First, it is not necessary to find an optimal hitting set in every iteration. Only when the hitting set A is a relaxed plan do we need to verify that its cost is minimal. Thus, we use a fast approximate algorithm to generate hitting sets H, updating A as we go (lines 8–9 of Figure 2; see also the next paragraph), until one that reaches the relaxed goal is found.
Only then do we apply the optimal branch-and-bound algorithm, detailed below, using the cost of the non-optimal A as an initial upper bound. After this, A is a cost-minimal hitting set. If it is also a relaxed plan, it is the solution; if not, the algorithm continues.

    L = ∅
    A = ∅
    while G ⊈ R(A) do
        L = L ∪ NEWLANDMARK(A)
        A = MINCOSTHITTINGSET(L)
    return A

Figure 1: Basic Iterative Landmark Algorithm

     1: L = ∅
     2: while G ⊈ R(⋃_{l ∈ L} l) do
     3:     L = L ∪ NEWLANDMARK(⋃_{l ∈ L} l)
     4: A = MINCOSTHITTINGSET(L)
     5: while G ⊈ R(A) do
     6:     L = L ∪ NEWLANDMARK(A)
     7:     H = APXHITTINGSET(L)
     8:     if G ⊆ R(A ∪ H) then A = H
     9:     else A = A ∪ H
    10:     if G ⊆ R(A) then A = MINCOSTHITTINGSET(L)
    11: return A

Figure 2: Iterative Landmark Algorithm with improvements

One candidate for H is the hitting set from the previous iteration extended with the cheapest action in the new (unhit) landmark. Another is the set found by the standard greedy algorithm using the weighted degree heuristic (Chvátal 1979), which greedily adds actions in increasing order of the ratio of their cost to the number of landmarks they hit. We take whichever of these two has lower cost.

Second, we collect the union of recently found hitting sets, and first try using this in place of only the last hitting set. If this larger set is also not a relaxed plan, it is used as the starting point to generate a new landmark. If it is a relaxed plan, earlier hitting sets are forgotten and the collection is reset to the last hitting set only. The effect of using a larger relaxed non-plan is similar to that of “saturation” used by Bonet and Castillo (2011): it makes new landmarks have less in common with those already in L, thus creating easier hitting set problems and faster convergence to the optimal h+ value.

Taking this idea a step further, as long as the set of all actions appearing in current landmarks is not a relaxed plan, we can use that to generate the next landmark, which will be disjoint from all previous ones.
We do this (lines 2 and 3 of Figure 2) to seed L and A before the main loop.

Two steps in this algorithm are frequently repeated, and therefore important to do efficiently: the first is testing if the goal is relaxed reachable with a given set of actions, and the second is finding a hitting set with minimum cost. Their implementations are detailed in the following sections.

Testing Relaxed Reachability

R(A), the set of atoms that are relaxed reachable with actions A, can be computed in linear time by a variant of Dijkstra’s algorithm: keep track of the number of unreached preconditions of each action, and keep a queue of newly reached atoms, initialised with the atoms true in s_I. Dequeue one atom at a time until the queue is empty; when dequeueing an atom, decrement the precondition counter of each action that has this atom as a precondition, and if the counter reaches zero, mark the atoms added by the action as reached and place any previously unmarked ones on the queue.

When generating a new landmark, we perform a series of reachability computations, with mostly increasing sets of actions, i.e., H, H ∪ {a1}, H ∪ {a1, a2}, etc. Therefore, each reachability test can be done incrementally. Suppose we have computed R(A), by the algorithm above, and now wish to compute R(A ∪ {a}). If pre(a) ⊈ R(A), then R(A ∪ {a}) equals R(A), and the only thing that must be done is to initialise a’s counter of unreached preconditions. Otherwise, i.e., if pre(a) ⊆ R(A), mark and enqueue any previously unreached atoms in add(a), and resume the main loop until the queue is again empty. If the goal becomes reachable, we must remove the last added action (a) from the set, and thus must restore the earlier state of reachability. This is done by saving the state of R(A) (including precondition counters) before computing R(A ∪ {a}), and copying it back if needed.

Finding a Minimum-Cost Hitting Set

Finding a minimum-cost hitting set over the set of landmarks L is an NP-hard problem.
We solve it using a recursive branch-and-bound algorithm, with some improvements. When finding a hitting set for {l_1, ..., l_m}, we already have an optimal hitting set for {l_1, ..., l_{m−1}}. Its cost H({l_1, ..., l_{m−1}}) is clearly a lower bound on H({l_1, ..., l_m}), and an initial upper bound can be found by taking H({l_1, ..., l_{m−1}}) + min_{a ∈ l_m} cost(a). These bounds are often very tight, which limits the amount of search. For example, if l_m contains any zero-cost action, the initial upper bound equals the lower bound, and is thus optimal.

Given a set L = {l_1, ..., l_m} of landmarks to hit, we pick a landmark l_i ∈ L; the minimum cost of a hitting set for L is H(L) = min_{a ∈ l_i} [H(L − {l | a ∈ l}) + cost(a)]. In our implementation, we pick the landmark whose cheapest action has the highest cost, using landmark size to break ties (preferring smaller). We branch on which action in l_i to include in the hitting set, cheapest action first. Next, we describe the lower bounds, and three improvements to the basic branch-and-bound scheme, that we use.

Lower Bounds

We use the maximum of two lower bounds on H(L). The first is obtained by selecting a subset L′ ⊆ L s.t. l ∩ l′ = ∅ for any two distinct l, l′ ∈ L′, i.e., a set of pairwise disjoint landmarks, and summing the costs of their cheapest actions, i.e., ∑_{l ∈ L′} min_{a ∈ l} cost(a). Finding the set L′ that yields the maximum lower bound amounts to solving a weighted independent set problem, but there are reasonably good and fast approximation algorithms (e.g. Halldórsson 2000). The second bound is simply the continuous relaxation of the integer programming formulation of the hitting set problem: min ∑
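The components described in the text can be combined into a small end-to-end sketch of the basic algorithm of Figure 1. The Python below is an illustration under stated assumptions, not the authors' implementation. All names (`Action`, `reachable`, `new_landmark`, `disjoint_bound`, `min_cost_hitting_set`, `h_plus`) are chosen here. Reachability uses the counter-and-queue scheme from ‘Testing Relaxed Reachability’ but recomputes from scratch instead of saving and restoring state. The branch-and-bound follows the branching rule stated above, uses only the disjoint-landmark lower bound (computed greedily), and omits the continuous-relaxation bound and the further improvements.

```python
from collections import defaultdict, deque
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    """Hypothetical delete-free action with a non-negative cost."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    cost: int


def reachable(actions, init):
    """R(A) via precondition counters and a queue of newly reached atoms."""
    actions = list(actions)
    unmet = {a: len(a.pre) for a in actions}
    users = defaultdict(list)  # atom -> actions that have it as a precondition
    for a in actions:
        for p in a.pre:
            users[p].append(a)
    reached = set(init)
    queue = deque(reached)

    def fire(a):
        for q in a.add:
            if q not in reached:
                reached.add(q)
                queue.append(q)

    for a in actions:
        if not a.pre:  # no preconditions: applicable immediately
            fire(a)
    while queue:
        p = queue.popleft()
        for a in users[p]:
            unmet[a] -= 1
            if unmet[a] == 0:
                fire(a)
    return reached


def new_landmark(all_actions, non_plan, init, goal):
    """Grow a relaxed non-plan to a maximal one and return its complement."""
    kept = set(non_plan)
    for a in all_actions:
        if a not in kept and not goal <= reachable(kept | {a}, init):
            kept.add(a)
    return frozenset(a for a in all_actions if a not in kept)


def disjoint_bound(landmarks):
    """Greedy lower bound: cheapest-action costs of pairwise disjoint landmarks."""
    used, bound = set(), 0
    for lm in sorted(landmarks, key=lambda l: -min(a.cost for a in l)):
        if used.isdisjoint(lm):
            used |= lm
            bound += min(a.cost for a in lm)
    return bound


def min_cost_hitting_set(landmarks):
    """Recursive branch-and-bound over which action hits the chosen landmark."""
    best = [float("inf"), None]  # incumbent cost and incumbent set

    def search(remaining, chosen, cost):
        if cost + disjoint_bound(remaining) >= best[0]:
            return
        if not remaining:
            best[0], best[1] = cost, set(chosen)
            return
        # Landmark whose cheapest action costs most; ties go to the smaller one.
        lm = max(remaining, key=lambda l: (min(a.cost for a in l), -len(l)))
        for a in sorted(lm, key=lambda a: a.cost):  # cheapest action first
            search([l for l in remaining if a not in l], chosen + [a], cost + a.cost)

    search(list(landmarks), [], 0)
    return best[1]


def h_plus(all_actions, init, goal):
    """Figure 1 loop; assumes the goal is relaxed reachable with all actions."""
    landmarks, plan = [], set()
    while not goal <= reachable(plan, init):
        landmarks.append(new_landmark(all_actions, plan, init, goal))
        plan = min_cost_hitting_set(landmarks)
    return sum(a.cost for a in plan), plan


a1 = Action("a1", frozenset({"i"}), frozenset({"p"}), 1)
a2 = Action("a2", frozenset({"p"}), frozenset({"g"}), 1)
a3 = Action("a3", frozenset({"i"}), frozenset({"g"}), 3)
cost, plan = h_plus([a1, a2, a3], {"i"}, {"g"})
print(cost, sorted(a.name for a in plan))  # prints: 2 ['a1', 'a2']
```

On this three-action example the loop generates the landmarks {a2, a3} and {a1, a3} before the hitting set {a1, a2} reaches the goal. A single pass in `new_landmark` suffices to obtain a maximal non-plan because relaxed reachability is monotone in the action set: an action rejected against a smaller set would also be rejected against any larger one.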


Similar resources

On a Practical, Integer-Linear Programming Model for Delete-Free Tasks and its Use as a Heuristic for Cost-Optimal Planning

We propose a new integer-linear programming model for the delete relaxation in cost-optimal planning. While a straightforward IP for the delete relaxation is impractical, our enhanced model incorporates variable reduction techniques based on landmarks, relevance-based constraints, dominated action elimination, immediate action application, and inverse action constraints, resulting in an IP that...


A Practical, Integer-Linear Programming Model for the Delete-Relaxation in Cost-Optimal Planning

We propose a new integer-linear programming model for the delete relaxation in cost-optimal planning. While a naive formulation of the delete relaxation as IP is impractical, our model incorporates landmarks and relevance-based constraints, resulting in an IP that can be used to directly solve the delete relaxation. We show that our IP model outperforms the previous state-of-the-art solver for ...


Pruning Methods for Optimal Delete-Free Planning

Delete-free planning underlies many popular relaxation (h+) based heuristics used in state-of-the-art planners; it provides a simpler setting for exploring new pruning methods and other ideas; and a number of interesting recent planning domains are naturally delete-free. In this paper we explore new pruning methods for planning in delete-free planning domains. First, we observe that optimal del...


The Minimal Seed Set Problem

This paper defines and studies a new, interesting, and challenging benchmark problem that originates in systems biology. The minimal seed-set problem is defined as follows: given a description of the metabolic reactions of an organism, characterize the minimal set of nutrients with which it could synthesize all nutrients it is capable of synthesizing. Current methods used in systems biology yie...


Optimal Planning for Delete-Free Tasks with Incremental LM-Cut

Optimal plans of delete-free planning tasks are interesting both in domains that have no delete effects and as the relaxation heuristic h+ in general planning. Many heuristics for optimal and satisficing planning approximate the h+ heuristic, which is well-informed and admissible but intractable to compute. In this work, branch-and-bound and IDA* search are used in a search space tailored to dele...



Publication date: 2012